Abstract


Recently, a one-parameter extension of the Polak-Ribière-Polyak method
has been suggested, having acceptable theoretical features and promising
numerical behavior. Here, an adaptive formula for computing the parameter
of the method is proposed. It is based on an eigenvalue analysis of the
method, with the aim of avoiding a search direction in the direction of the
maximum magnification by a symmetric version of the search direction
matrix. Under standard assumptions, the given formula ensures the
sufficient descent property and guarantees the global convergence of the
method. Numerical experiments on a collection of CUTEr test problems show
the practical effectiveness of the suggested formula for the parameter of
the method.
1 Introduction


Conjugate gradient (CG) methods are among the most popular optimization
techniques owing to their wide range of practical applications
[1, 11, 12, 17]. CG algorithms are attractive because of their modest
memory requirements, the simple structure of the iterative formula,
promising computational performance, and acceptable convergence
properties [5, 9, 13].


The general form of an unconstrained optimization problem is

$$\min_{x \in \mathbb{R}^n} f(x),$$

where $f$ is a smooth real-valued nonlinear function with gradient $g(x)$.
Starting from some point $x_0 \in \mathbb{R}^n$, the iterations of CG
algorithms take the form $x_{k+1} = x_k + s_k$ with $s_k = \alpha_k d_k$,
for all $k \geq 0$, where $\alpha_k > 0$ is a step length, often determined
by an inexact line search along the direction $d_k$ calculated by

$$d_0 = -g_0, \qquad d_{k+1} = -g_{k+1} + \beta_k d_k, \quad k \geq 0, \tag{1}$$


in which $\beta_k \in \mathbb{R}$ is called the CG parameter and
$g_k = g(x_k)$. Among the various classical CG techniques, the
Polak-Ribière-Polyak (PRP) method with

$$\beta_k^{PRP} = \frac{g_{k+1}^T y_k}{\|g_k\|^2},$$

in which $y_k = g_{k+1} - g_k$ and $\|\cdot\|$ denotes the $\ell_2$ norm, is
regarded as an efficient and popular classical method, mainly because it
restarts adaptively when it encounters improper search directions [9].
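The recurrence (1) with the PRP parameter can be sketched in a few lines. The following Python sketch is illustrative only: the helper names (`dot`, `prp_direction`) and the sample vectors are assumptions, not taken from the paper. It also illustrates the restart behavior mentioned above: when consecutive gradients coincide (as after a stalled step), $y_k = 0$, so $\beta_k^{PRP} = 0$ and the direction falls back to steepest descent.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def prp_direction(g_new, g_old, d_old):
    """Return (beta_PRP, d_{k+1}) from recurrence (1) with the PRP parameter."""
    y = [a - b for a, b in zip(g_new, g_old)]      # y_k = g_{k+1} - g_k
    beta = dot(g_new, y) / dot(g_old, g_old)       # beta_k^{PRP}
    d_new = [-gn + beta * dk for gn, dk in zip(g_new, d_old)]
    return beta, d_new


# Hypothetical data: d_0 = -g_0, followed by one CG update.
g_old = [1.0, 2.0]
d_old = [-1.0, -2.0]
g_new = [0.5, -1.0]
beta, d_new = prp_direction(g_new, g_old, d_old)

# Stalled step: identical gradients give beta = 0, i.e. a restart d = -g.
beta_restart, d_restart = prp_direction(g_old, g_old, d_old)
```
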


Although computationally advantageous, the PRP method fails to ensure the
descent property [9]. So, significant attention has been paid to descent
modifications of the PRP method. For example, Zhang, Zhou, and Li [18]
developed a three-term extension of the method (ZZL) by

$$d_0 = -g_0, \qquad d_{k+1}^{ZZL} = -g_{k+1} + \beta_k^{PRP} d_k - \frac{g_{k+1}^T d_k}{\|g_k\|^2} y_k, \quad k \geq 0, \tag{2}$$

satisfying the sufficient descent condition, that is,

$$d_k^T g_k \leq -\tau \|g_k\|^2, \quad k \geq 0, \tag{3}$$

where $\tau > 0$ is a constant. In another effort, Andrei [3] proposed a spectral


PRP (SPRP) method with

$$d_0 = -g_0, \qquad d_{k+1}^{SPRP} = -\frac{s_k^T y_k}{\|g_k\|^2} g_{k+1} + \beta_k^{PRP} s_k - \frac{g_{k+1}^T s_k}{\|g_k\|^2} y_k, \quad k \geq 0, \tag{4}$$
which, in addition to (3), fulfills the effective Dai-Liao conjugacy
condition [6]. Also, Babaie-Kafaki and Ghanbari [4] developed a
one-parameter extension of the PRP method (EPRP) based on the Dai-Liao
approach [6]; that is,

$$\beta_k^{EPRP} = \beta_k^{PRP} - t \frac{g_{k+1}^T d_k}{\|g_k\|^2}, \tag{5}$$


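where $t$ is a positive parameter. Then, to find an optimal choice for $t$,
they noted that from (1) and (5) the search directions of EPRP can be
written as

$$d_{k+1} = -H_{k+1} g_{k+1},$$

where

$$H_{k+1} = I - \frac{d_k y_k^T}{\|g_k\|^2} + t \frac{d_k d_k^T}{\|g_k\|^2}.$$

Symmetrizing $H_{k+1}$ by

$$P_{k+1} = \frac{H_{k+1} + H_{k+1}^T}{2} = I - \frac{1}{2} \frac{d_k y_k^T + y_k d_k^T}{\|g_k\|^2} + t \frac{d_k d_k^T}{\|g_k\|^2}, \tag{6}$$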


in light of an eigenvalue analysis, the following family of two-parameter 
choices for t was suggested in [4]: 


2 

a= lle a (3 tfem _ — lel) 

k= , 
gel?” \2[ldall [loud Ul 


(7) 


with $p > \frac{1}{4}$ and $q > -1$, guaranteeing the descent condition.
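The descent behavior of the directions above can be checked numerically. The sketch below is a minimal illustration under stated assumptions: it takes the ZZL direction in the form $d_{k+1} = -g_{k+1} + \beta_k^{PRP} d_k - (g_{k+1}^T d_k / \|g_k\|^2) y_k$ and the EPRP parameter in the form $\beta_k^{EPRP} = \beta_k^{PRP} - t\, g_{k+1}^T d_k / \|g_k\|^2$, as read from (2) and (5); the function names and sample data are hypothetical. For ZZL, the two extra terms cancel in $g_{k+1}^T d_{k+1}$, which then equals $-\|g_{k+1}\|^2$. For EPRP, $-g_{k+1}^T d_{k+1}$ equals the quadratic form $g_{k+1}^T P_{k+1} g_{k+1}$ of the symmetrized matrix in (6), which is why the eigenvalues of $P_{k+1}$ govern descent.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def zzl_direction(g_new, g_old, d_old):
    """Three-term direction in the assumed form of (2)."""
    y = [a - b for a, b in zip(g_new, g_old)]
    gg = dot(g_old, g_old)                      # ||g_k||^2
    beta = dot(g_new, y) / gg                   # beta_k^{PRP}
    coef = dot(g_new, d_old) / gg               # multiplier of y_k
    return [-gn + beta * dk - coef * yk
            for gn, dk, yk in zip(g_new, d_old, y)]


def eprp_direction(g_new, g_old, d_old, t):
    """Direction (1) with the EPRP parameter in the assumed form of (5)."""
    y = [a - b for a, b in zip(g_new, g_old)]
    gg = dot(g_old, g_old)
    beta = (dot(g_new, y) - t * dot(g_new, d_old)) / gg
    return [-gn + beta * dk for gn, dk in zip(g_new, d_old)]


def quad_form_P(g_new, g_old, d_old, t):
    """g^T P g for the symmetrized matrix in (6), without forming P."""
    y = [a - b for a, b in zip(g_new, g_old)]
    gg = dot(g_old, g_old)
    gd, gy = dot(g_new, d_old), dot(g_new, y)
    return dot(g_new, g_new) - gd * gy / gg + t * gd * gd / gg


# Hypothetical data, chosen only for illustration.
g_old = [1.0, -2.0, 0.5]
d_old = [-0.8, 1.5, -1.0]
g_new = [0.3, 0.4, -1.2]
t = 0.7

d_zzl = zzl_direction(g_new, g_old, d_old)
zzl_gap = dot(g_new, d_zzl) + dot(g_new, g_new)          # expected to vanish

d_eprp = eprp_direction(g_new, g_old, d_old, t)
eprp_gap = dot(g_new, d_eprp) + quad_form_P(g_new, g_old, d_old, t)
```

Both gaps vanish up to rounding, for any choice of the input vectors with $g_k \neq 0$.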


Following such studies, here we deal with another choice for the parameter
of the EPRP method, based on the concept of the maximum magnification by
a matrix. The rest of this study is organized as follows. In Section 2,
after analyzing the eigenvalues of $P_{k+1}$, we introduce our new formula
for the parameter $t$ of the EPRP method and conduct a brief global
convergence analysis. In Section 3, we report comparative computational
experiments on a collection of CUTEr problems, using the Dolan-Moré
performance profile. Finally, concluding remarks are given in Section 4.


2 An adaptive choice for the parameter of the extended
Polak-Ribière-Polyak method


Here, we first conduct a concise eigenvalue analysis of $P_{k+1}$ in order
to explain our adaptive way of computing $t$ in (5). Hereafter, we assume
that $d_k^T y_k > 0$, as ensured by the strong Wolfe line search conditions
[14], that is,

$$f(x_k + \alpha_k d_k) - f(x_k) \leq \delta \alpha_k g_k^T d_k, \tag{8}$$

$$|\nabla f(x_k + \alpha_k d_k)^T d_k| \leq -\sigma g_k^T d_k, \tag{9}$$

with $0 < \delta < \sigma < 1$. Indeed, (9) gives
$g_{k+1}^T d_k \geq \sigma g_k^T d_k$, so
$d_k^T y_k \geq (\sigma - 1) g_k^T d_k > 0$ whenever $d_k$ is a descent
direction. Definition 1 below is the kernel of our analysis.
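As a side note on the line search, conditions (8) and (9) can be tested for a trial step length as in the following sketch. It is an illustration only: the function name `strong_wolfe_holds`, the default values of `delta` and `sigma`, and the quadratic test function are assumptions (the paper defers its line search settings to [2]).

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def strong_wolfe_holds(f, grad, x, d, alpha, delta=1e-4, sigma=0.1):
    """Check conditions (8) and (9) for a trial step length alpha."""
    gd = dot(grad(x), d)                         # g_k^T d_k, negative for descent
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    sufficient_decrease = f(x_new) - f(x) <= delta * alpha * gd   # (8)
    curvature = abs(dot(grad(x_new), d)) <= -sigma * gd           # (9)
    return sufficient_decrease and curvature


# Hypothetical test function f(x) = 0.5 * ||x||^2, whose gradient is x.
f = lambda x: 0.5 * dot(x, x)
grad = lambda x: list(x)

x = [1.0, 1.0]
d = [-1.0, -1.0]                                 # steepest descent direction
accepted = strong_wolfe_holds(f, grad, x, d, alpha=1.0)
rejected = strong_wolfe_holds(f, grad, x, d, alpha=2.0)   # no decrease in f

# For the accepted step, y_k = g_{k+1} - g_k satisfies d_k^T y_k > 0.
x_new = [xi + 1.0 * di for xi, di in zip(x, d)]
y = [a - b for a, b in zip(grad(x_new), grad(x))]
curvature_product = dot(d, y)
```
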


Definition 1. [15] For an arbitrary matrix $A \in \mathbb{R}^{n \times m}$,
the scalar

$$\mathrm{maxmag}(A) = \max_{x \neq 0} \frac{\|Ax\|}{\|x\|}$$

is called the maximum magnification by $A$. Hence,
$\mathrm{maxmag}(A) = \|A\|$, and the vector $x \neq 0$ for which
$\|Ax\| = \|A\| \|x\|$ is in the direction of the maximum magnification by
the matrix $A$.
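To make Definition 1 concrete, the sketch below estimates the maximum magnification and a maximizing direction by power iteration on $A^T A$. Power iteration is a technique swapped in here for illustration; it is not part of the paper's method. The helper names, the starting vector, and the sample matrix are assumptions, and the iteration presumes that the starting vector is not orthogonal to the maximizing direction and that $Ax$ never vanishes.

```python
import math


def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def maxmag(A, iters=200):
    """Estimate max ||Ax|| / ||x|| and a maximizing unit vector."""
    At = transpose(A)
    x = [1.0] + [0.0] * (len(A[0]) - 1)          # arbitrary starting vector
    for _ in range(iters):
        z = matvec(At, matvec(A, x))             # one step on A^T A
        norm = math.sqrt(sum(zi * zi for zi in z))
        x = [zi / norm for zi in z]
    Ax = matvec(A, x)
    return math.sqrt(sum(v * v for v in Ax)), x


# Symmetric example with eigenvalues 3 and 1.
A = [[2.0, 1.0],
     [1.0, 2.0]]
mag, direction = maxmag(A)
```

For this matrix the estimate approaches 3, attained along the direction with equal components.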


Firstly, note that the matrix $P_{k+1}$ given by (6) can be regarded as a
symmetric approximation of the search direction matrix $H_{k+1}$. Based on
the analysis of [4], the eigenvalues of $P_{k+1}$ are 1, with multiplicity
$n - 2$, together with $\lambda_k^+$ and $\lambda_k^-$ given by

$$\lambda_k^{\pm} = 1 + \frac{1}{2\|g_k\|^2}\left(t\|d_k\|^2 - d_k^T y_k\right) \pm \frac{1}{2\|g_k\|^2}\sqrt{\left(t\|d_k\|^2 - d_k^T y_k\right)^2 + \|d_k\|^2\|y_k\|^2 - \left(d_k^T y_k\right)^2}.$$
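The eigenvalue formula can be recovered by a short trace and determinant argument. The derivation below is a sketch added for the reader, assuming that $d_k$ and $y_k$ are linearly independent and that $P_{k+1}$ has the form (6); the symbols $M$ and $\mu^{\pm}$ are introduced here only for illustration.

```latex
P_{k+1} = I + M, \qquad
M = \frac{1}{\|g_k\|^2}\Big( t\, d_k d_k^T
      - \tfrac{1}{2}\big(d_k y_k^T + y_k d_k^T\big) \Big).

% M vanishes on the orthogonal complement of span{d_k, y_k},
% so P_{k+1} has the eigenvalue 1 with multiplicity n - 2.
% The two remaining eigenvalues are 1 + \mu^{\pm}, where

\mu^+ + \mu^- = \operatorname{tr}(M)
             = \frac{t\|d_k\|^2 - d_k^T y_k}{\|g_k\|^2}, \qquad
\mu^+ \mu^- = -\frac{\|d_k\|^2\|y_k\|^2 - (d_k^T y_k)^2}{4\|g_k\|^4}.

% Solving \mu^2 - (\mu^+ + \mu^-)\mu + \mu^+\mu^- = 0 gives

\mu^{\pm} = \frac{1}{2\|g_k\|^2}\Big( t\|d_k\|^2 - d_k^T y_k
  \pm \sqrt{\big(t\|d_k\|^2 - d_k^T y_k\big)^2
            + \|d_k\|^2\|y_k\|^2 - (d_k^T y_k)^2} \Big).
```

By the Cauchy-Schwarz inequality the product $\mu^+ \mu^-$ is nonpositive, so $\mu^- \leq 0 \leq \mu^+$; that is, one of the two remaining eigenvalues is at least 1 and the other is at most 1, for every $t$.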


It can be seen that with the choice (7), we have
$\lambda_k^+ \geq 1 \geq \lambda_k^- > 0$, and consequently,
$\|P_{k+1}\| = \lambda_k^+$. Also, in light of a similar analysis carried
out in [2], the eigenvector of $P_{k+1}$ corresponding to $\lambda_k^+$,
here called $v_k^+$, can be written as $v_k^+ = \nu_k d_k + y_k$, in which

$$\nu_k = \frac{2(1 - \lambda_k^+)\|g_k\|^2 - d_k^T y_k}{\|d_k\|^2}.$$

So, $v_k^+$ is specified as a vector in the direction of the maximum
magnification by $P_{k+1}$.
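The eigenpair described above can be verified numerically without forming the $n \times n$ matrix. The sketch below is a hedged illustration: it assumes the forms of $P_{k+1}$, of the largest eigenvalue, and of the eigenvector coefficient as read from the surrounding text, and the names `apply_P`, `max_eigenpair`, and the sample data are hypothetical.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def apply_P(v, d, y, gg, t):
    """Return P v for P as in (6), matrix-free; gg stands for ||g_k||^2."""
    dv, yv = dot(d, v), dot(y, v)
    return [vi - 0.5 * (di * yv + yi * dv) / gg + t * di * dv / gg
            for vi, di, yi in zip(v, d, y)]


def max_eigenpair(d, y, gg, t):
    """Largest eigenvalue of P and an eigenvector of the form nu * d + y."""
    dd, yy, dy = dot(d, d), dot(y, y), dot(d, y)
    T = t * dd - dy
    lam = 1.0 + (T + math.sqrt(T * T + dd * yy - dy * dy)) / (2.0 * gg)
    nu = (2.0 * (1.0 - lam) * gg - dy) / dd
    v = [nu * di + yi for di, yi in zip(d, y)]
    return lam, v


# Hypothetical data with d^T y > 0.
d = [1.0, 0.0, 0.0]
y = [1.0, 1.0, 0.0]
gg, t = 2.0, 1.0

lam, v = max_eigenpair(d, y, gg, t)
Pv = apply_P(v, d, y, gg, t)
residual = max(abs(p - lam * vi) for p, vi in zip(Pv, v))   # ||P v - lam v||_inf
```
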

As explained in [2], when the gradient is approximately parallel to the
direction of the maximum magnification by $H_{k+1}$, EPRP may suffer from
numerical errors and may have difficulty converging. Based on this fact,
and since $P_{k+1}$ is a symmetric approximation of $H_{k+1}$, it can be
stated that if $g_{k+1}$ is as far away as possible from the direction of
the maximum magnification by $P_{k+1}$, then such errors may be diminished
and the convergence may be improved. Hence, we obtain a formula for the
EPRP parameter by making $v_k^+$ orthogonal to $g_{k+1}$, in the sense of
solving the equation $g_{k+1}^T v_k^+ = 0$; that is,

$$t_k = \frac{\|y_k\|^2 (g_{k+1}^T d_k)^2 - \|d_k\|^2 (g_{k+1}^T y_k)^2}{2 (g_{k+1}^T d_k)\left((d_k^T y_k)(g_{k+1}^T d_k) - \|d_k\|^2 (g_{k+1}^T y_k)\right)}. \tag{10}$$
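The orthogonality that motivates (10) can be checked on a small example. The following sketch is an assumption-laden illustration: it evaluates (10) in the form of a ratio whose numerator is $\|y_k\|^2 (g_{k+1}^T d_k)^2 - \|d_k\|^2 (g_{k+1}^T y_k)^2$ and whose denominator is $2 (g_{k+1}^T d_k)((d_k^T y_k)(g_{k+1}^T d_k) - \|d_k\|^2 (g_{k+1}^T y_k))$, then builds the eigenvector of $P_{k+1}$ for the largest eigenvalue and takes its inner product with $g_{k+1}$. The function names and data are hypothetical, and the check is claimed only for this data.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def orthogonality_parameter(g_new, d, y):
    """Evaluate (10); raises ZeroDivisionError if the denominator vanishes."""
    dd, yy, dy = dot(d, d), dot(y, y), dot(d, y)
    gd, gy = dot(g_new, d), dot(g_new, y)
    num = yy * gd ** 2 - dd * gy ** 2
    den = 2.0 * gd * (dy * gd - dd * gy)
    return num / den


def max_eigenvector(d, y, gg, t):
    """Eigenvector nu * d + y of P for its largest eigenvalue (gg = ||g_k||^2)."""
    dd, yy, dy = dot(d, d), dot(y, y), dot(d, y)
    T = t * dd - dy
    lam = 1.0 + (T + math.sqrt(T * T + dd * yy - dy * dy)) / (2.0 * gg)
    nu = (2.0 * (1.0 - lam) * gg - dy) / dd
    return [nu * di + yi for di, yi in zip(d, y)]


# Hypothetical iterate: d_k = -g_k and a new gradient g_{k+1}.
g_old = [1.0, 0.0, 0.0]
d = [-1.0, 0.0, 0.0]
g_new = [-0.5, 1.0, 0.0]
y = [a - b for a, b in zip(g_new, g_old)]        # d^T y = 1.5 > 0

t = orthogonality_parameter(g_new, d, y)
v = max_eigenvector(d, y, dot(g_old, g_old), t)
inner = dot(g_new, v)                            # g_{k+1}^T v for this t
```
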
Now, for the sake of positiveness of the EPRP parameter and to achieve the
sufficient descent property, we suggest the following modified version of
(10):

$$t_k^* = \begin{cases} \max\left\{ t_k, \; \theta \dfrac{\|y_k\|^2}{\|g_k\|^2} \right\}, & \text{if the denominator of } t_k \text{ is nonzero}, \\[2ex] \theta \dfrac{\|y_k\|^2}{\|g_k\|^2}, & \text{otherwise}, \end{cases} \tag{11}$$

where $\theta > \frac{1}{4}$ is a constant.
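The safeguarded rule (11) is straightforward to code. The sketch below is a minimal illustration rather than the authors' implementation: the function name and data are hypothetical, the default `theta=0.26` is the value reported in Section 3, and the exact zero test on the denominator is a simplification (a small tolerance may be preferable in floating-point arithmetic).

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def adaptive_parameter(g_new, g_old, d, theta=0.26):
    """Evaluate (11): the value from (10) with a lower safeguard."""
    y = [a - b for a, b in zip(g_new, g_old)]
    dd, yy, dy = dot(d, d), dot(y, y), dot(d, y)
    gd, gy = dot(g_new, d), dot(g_new, y)
    floor = theta * yy / dot(g_old, g_old)       # theta * ||y_k||^2 / ||g_k||^2
    den = 2.0 * gd * (dy * gd - dd * gy)         # denominator of (10)
    if den == 0.0:                               # exact test, see the note above
        return floor
    t = (yy * gd ** 2 - dd * gy ** 2) / den      # formula (10)
    return max(t, floor)


# Hypothetical data.
g_old = [1.0, 0.0, 0.0]
d = [-1.0, 0.0, 0.0]
t_regular = adaptive_parameter([-0.5, 1.0, 0.0], g_old, d)   # (10) is used
t_fallback = adaptive_parameter([0.0, 1.0, 0.0], g_old, d)   # g_{k+1}^T d_k = 0
```

In the first call the value from (10) exceeds the safeguard and is returned; in the second the denominator vanishes and the safeguard value is used.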
Now, similar to the analysis conducted in the proof of [16, Theorem 3.2],
the following convergence result can be established for the EPRP method.
The proof is omitted to avoid repetition.


Theorem 1. Suppose that the level set
$\Omega = \{x \in \mathbb{R}^n \mid f(x) \leq f(x_0)\}$ is bounded and that,
in some neighborhood $\mathcal{N}$ of $\Omega$, the objective function $f$
is smooth and $\nabla f$ is Lipschitz continuous. For the EPRP method with
the parameter (5), assume that $t$ is equal to $t_k^*$ defined by (11) and
the line search fulfills the strong Wolfe conditions (8) and (9). If there
exists a positive lower bound $\alpha^*$ for the step lengths $\alpha_k$
(for all $k \geq 0$), then $\lim_{k \to \infty} \|g_k\| = 0$.


3 Computational experiments 


In this section, we examine the numerical efficiency of the EPRP method
in which $t$ is computed by (11) with $\theta = 0.26$ and by (7) with
$(p, q) = (1, 0)$; here the corresponding methods are called EPRP1 and
EPRP2, respectively. These methods are compared with the two modified PRP
methods ZZL and SPRP, with the search directions (2) and (4), respectively
[3, 18]. We implemented all the algorithms in the MATLAB environment on a
set of 45 test functions of the CUTEr library [8] with $n > 50$, as given
in Table 1. Detailed hardware and software specifications are given in
[2], together with the strong Wolfe line search settings and the stopping
criteria. Detailed outputs are provided in Table 1.

The efficiency of the algorithms was compared by applying the performance
profile proposed by Dolan and Moré [7] to the norm of the gradient, the
CPU time (CPUT), and the total number of function and gradient evaluations
(TNFGE), following the notation of [10]. Results are shown in Figures 1-3.
As seen, EPRP is preferable to the other methods. In particular, the
results show that the choice (11) for the EPRP parameter is practically
effective. In our experiments, on average, $t_k^* = t_k$ held in 62.63% of
the iterations of EPRP1.
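For readers unfamiliar with the Dolan-Moré performance profile, the sketch below computes it for a table of costs. It is illustrative only: the function name is hypothetical, and the cost data are made up for two placeholder solvers and are not results from this study. The sketch assumes positive costs and that at least one solver succeeds on every problem.

```python
import math


def performance_profile(costs, taus):
    """For each solver, the fraction of problems whose cost is within a
    factor tau of the best cost achieved by any solver on that problem."""
    solvers = list(costs)
    n_problems = len(costs[solvers[0]])
    best = [min(costs[s][p] for s in solvers) for p in range(n_problems)]
    profile = {}
    for s in solvers:
        ratios = [costs[s][p] / best[p] for p in range(n_problems)]
        profile[s] = [sum(r <= tau for r in ratios) / n_problems
                      for tau in taus]
    return profile


# Made-up costs on four problems; math.inf marks a failure.
costs = {"A": [10.0, 20.0, 30.0, math.inf],
         "B": [12.0, 15.0, 60.0, 40.0]}
profile = performance_profile(costs, taus=[1.0, 1.5, 2.0])
```

The value at `tau = 1` is the share of problems on which a solver is the best; larger values of `tau` indicate robustness, since failures never enter the count.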
Figure 1: TNFGE performance profiles

Figure 2: CPUT performance profiles

4 Conclusion

Based on the concept of maximum magnification, we have conducted an
eigenvalue analysis to suggest an optimal choice for the parameter of the
recently proposed EPRP method. The suggested formula guarantees the
sufficient descent property as well as the global convergence of the
method. The effect of the proposed formula has been numerically
investigated in comparison with several other modifications of the
classical PRP method. The results showed the effectiveness of the
suggested choice for the EPRP parameter.



Aminifard and Babaie-Kafaki 


Figure 3: Norm of gradient performance profile